PARALLEL PROCESSORS FOR PLANNING
UNDER UNCERTAINTY


George B. Dantzig & Peter W. Glynn 


ABSTRACT 

In this paper we describe joint research under way by Mordecai Avriel, Robert Entriken, and the authors.
Our goal is to demonstrate, for an important class of multistage stochastic models, that a variety of techniques
for solving large-scale linear programs can be effectively mixed to attack this fundamental problem. The
ideas involve nested primal and dual decomposition, combined with Monte Carlo simulation, high speed
importance sampling, and quadrature methods for numerical integration, together with the use of parallel
processors.

1. Hedging Against Uncertainty

A long outstanding problem of great practical importance concerns finding an efficient way to 
do planning, scheduling, and control of complex systems under uncertainty. Although progress has 
been made on this fundamental problem of operations research, control theory, and economics by 
Roger Wets, John Birge, and others, it remains in general unsolved. 

We first state the deterministic version of the problem and then generalize it to include two
important characteristics of stochastic problems encountered in practice, which we will refer to as
intra-period and inter-period uncertainty. Mathematical programs are used for planning, scheduling,
and optimal design of large-scale complex systems. Applications include models used for
strategic planning, policy decisions to guide the growth of the economy, and scheduling production and
expansion of large-scale industrial enterprises such as those that generate and distribute electricity,
water, or fuel, or produce agricultural products. Many such models have thousands of variables and
equations. These models are mostly deterministic. Unfortunately, the solutions of deterministic
models are often not taken seriously because they do not properly hedge against future contingencies.

While it is relatively easy to reformulate deterministic models to take account of uncertainties, 
the rub has been that for complex time-staged systems the model size increases exponentially with 
the number of stages. This has made them too expensive to solve. [10,13] 

A variety of heuristic devices are used in practice to adjust deterministic solutions so that they
hedge. Scenario analysis is one popular way to do this. Several different scenarios are computed
(usually only five or six), the results are compared, and a compromise solution is somehow or other
arrived at empirically.

Birge has developed clever ways to arrive at approximate solutions to stochastic programs and
ways to estimate the quality of his approximations [4,5,6].

Our approach is more direct. Many scenarios are run and used to arrive at a compromise
solution that hedges against uncertainties. The sample space of all possible scenarios could be
continuous, or could run into millions of discrete points. For many problems, it is reasonable to
consider solving thousands of sample scenarios which are used as input data for generating the
hedging solution. Since these sample scenarios are independently drawn, it is easy to see why
parallel processors are ideal for efficiently carrying out such computations.

One class of deterministic models which we generalize to the stochastic case is the class of time-staged
linear programming models whose matrix structure is lower block triangular [11,12,14]. Other basic
references are [2,10,11,15,33]. See also [18,26,28,29,31,32]. By introduction of in-process inventories
and other devices, this class can be reduced to the mathematically equivalent “staircase” problems
of the form:

FIND min $Z$ and vectors $X_t \ge 0$, such that

\[
\begin{aligned}
b_1 &= A_1 X_1 \\
b_2 &= -B_1 X_1 + A_2 X_2 \\
&\;\;\vdots \\
b_t &= -B_{t-1} X_{t-1} + A_t X_t \\
&\;\;\vdots \\
b_T &= -B_{T-1} X_{T-1} + A_T X_T \\
(\min)\; Z &= c_1 X_1 + \cdots + c_t X_t + \cdots + c_T X_T
\end{aligned}
\]

where matrices $A_t$, $B_t$ and vectors $b_t$, $c_t$ are given.

Suppose one of several contingencies (events) can happen in the second period so that $b_2$, $B_1$,
and $A_2$ are not known with certainty. We index the possible events by $\omega = 1, 2, \ldots, K$ and assume
that $p(\omega)$, the probability of the event $\omega$, is known. Then in place of the second relation

\[
b_2 = -B_1 X_1 + A_2 X_2
\]

we have many relations of the form

\[
\begin{aligned}
b_2(1) &= -B_1(1) X_1 + A_2(1) X_2(1) \\
&\;\;\vdots \\
b_2(\omega) &= -B_1(\omega) X_1 + A_2(\omega) X_2(\omega) \\
&\;\;\vdots \\
b_2(K) &= -B_1(K) X_1 + A_2(K) X_2(K)
\end{aligned}
\]

If there are no further contingencies after the second period, then associated with each $X_2(\omega)$ will
be a system of relations of exactly the same form as those below $X_2$ above, except that the
variables $X_t$ for $t > 2$ are replaced by $X_t(\omega)$. In the objective equation, the terms $c_t X_t$ are replaced
by their expected values, $c_t \sum_{\omega} p(\omega) X_t(\omega)$.

In general, however, there will be contingencies happening in every period. The “event tree”
of contingent events in this more general case has the form

[Figure: event tree of contingent events.]

The number of branches associated with each node can be finite or infinite. It is obvious why
the size of the system can grow exponentially with the number of stages. Moreover, even if a
problem has only two stages, there can be a large or infinite number of possible contingencies in
the second stage. Some researchers, like Roger Wets [33,34,35], have concentrated on the two-stage
case because it is an important problem in its own right and because it can be used as a stepping
stone for finding solutions to the multi-stage case for certain classes of problems, as we will soon
see.
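The growth can be made concrete with a small counting sketch. It assumes, purely for illustration, that every node of the event tree has the same number of branches and carries the same number of constraint rows; the function names are hypothetical and not part of the paper.

```python
def event_tree_nodes(branches: int, stages: int) -> int:
    """Number of nodes in an event tree with a single first-stage node and
    `branches` successors below every node, over `stages` stages."""
    return sum(branches ** (t - 1) for t in range(1, stages + 1))


def deterministic_equivalent_rows(branches: int, stages: int, rows_per_node: int) -> int:
    """Each node carries one block of constraints, so rows scale with the node count."""
    return rows_per_node * event_tree_nodes(branches, stages)


if __name__ == "__main__":
    # Assumed sizes: 5 branches per node, 100 rows per node.
    for stages in (2, 4, 8, 12):
        print(stages, deterministic_equivalent_rows(branches=5, stages=stages, rows_per_node=100))
```

Even with only a handful of branches per node, the row count of the deterministic equivalent is multiplied by roughly the branching factor with every added stage.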


2. The Intra-Period Stochastic Submodel 

The general stochastic programming problem appears to us to be intractable given the present state of
the art. We, therefore, have been concentrating on classes of models that are relevant and whose
event tree does not grow exponentially with the number of time periods being modeled [13]. Here
are some examples:

Typically, industries use their facilities to carry out operations. Thus an airline has a fleet 
of airplanes of different kinds, and has other facilities for handling passengers on the ground and 
repairing aircraft. Their operations consist of flying aircraft, maintaining them, and serving passengers
on the ground. In the case of an electric utility, it has facilities for generating and distributing
electricity (dams, generators using nuclear fuel, fossil fuel, water power, and transmission lines). 
Operationally, these facilities are used to generate and distribute electricity. [16] 

Planning models for such industries may be essentially deterministic as far as their plans
for expansion of facilities are concerned. In the airline example, the deterministic part is the
schedule for purchase or retirement of aircraft in the fleet and the expansion of ground facilities.
Their operations, however, must be modeled in a stochastic way in order to be sure the facilities
on hand are sufficient to take care of various contingencies that might arise in day-to-day operations.

What allows us to decompose the problem into a deterministic part and an uncertain part is the
assumption that, whatever the contingencies that arise in day-to-day operations, the facilities
will not be destroyed in the process of using them for operations. (A situation in which this might
appear not to be true would be an aircraft being used for operations which is destroyed by an accident.
This contingency, however, can be modeled so as not to affect the future state of facilities if there
is insurance to cover the loss.) Models of this type for two periods have the form:

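Let $\omega = 1, 2, \ldots$ and $\bar{\omega} = 1, 2, \ldots$ be independently drawn random variables.

FIND min $Z$, $X_t \ge 0$, $U_1(\omega) \ge 0$, $U_2(\bar{\omega}) \ge 0$ such that:

\[
\begin{aligned}
b_1 &= A_1 X_1 \\
\text{***}\quad d_1(\omega) &= -D_1(\omega) X_1 + F_1(\omega) U_1(\omega) \\
b_2 &= -B_1 X_1 + A_2 X_2 \\
\text{***}\quad d_2(\bar{\omega}) &= -D_2(\bar{\omega}) X_2 + F_2(\bar{\omega}) U_2(\bar{\omega}) \\
\min Z &= c_1 X_1 + c_2 X_2 + \sum_{\omega} p_1(\omega) g_1(\omega) U_1(\omega) + \sum_{\bar{\omega}} p_2(\bar{\omega}) g_2(\bar{\omega}) U_2(\bar{\omega})
\end{aligned}
\]

The probabilities that $\omega$ and $\bar{\omega}$ occur are given by $p_1(\omega)$, $p_2(\bar{\omega})$. The objective minimizes
expected costs. The Intra-period Submodels correspond to the set of equations marked by ***
above, one for each period. The remaining equations constitute the “Deterministic” part of the
system.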

Benders decomposition is an ideal way to solve such a model, as we shall soon see [3,17].
Under this approach, the submodels are solved with $X_1$ and $X_2$ temporarily fixed at some values,
$X_1 = X_1^*$ and $X_2 = X_2^*$. Note that, when $X_1$ and $X_2$ are known, the submodels decompose into
many, many small independent subproblems, one for each value of $\omega = 1, 2, \ldots$ and $\bar{\omega} = 1, 2, \ldots$,
namely, for each $\omega = 1, 2, \ldots$,

FIND min $g_1(\omega) U_1(\omega)$, $U_1(\omega) \ge 0$, such that $F_1(\omega) U_1(\omega) = d_1(\omega) + D_1(\omega) X_1^*$;

and for each $\bar{\omega} = 1, 2, \ldots$, namely

FIND min $g_2(\bar{\omega}) U_2(\bar{\omega})$, $U_2(\bar{\omega}) \ge 0$, such that $F_2(\bar{\omega}) U_2(\bar{\omega}) = d_2(\bar{\omega}) + D_2(\bar{\omega}) X_2^*$.

Parallel processors can be effectively used to solve these problems wholesale for all choices of $\omega$
and $\bar{\omega}$ when there are not too many values of $\omega$ and $\bar{\omega}$. When there are too many, a
“representative” sample is used instead.
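The independence of the subproblems once the facility levels are fixed can be sketched as follows. This is a minimal illustration under strong assumptions that are not in the paper: a single scalar facility level, a single recourse activity (unmet demand, penalized at a unit cost), and therefore a closed-form subproblem solution in place of a linear programming solve. All names and data are hypothetical. A thread pool stands in for the parallel processors; for real LP solves a process pool or separate machines would be the natural choice.

```python
from concurrent.futures import ThreadPoolExecutor


def solve_subproblem(args):
    """Toy subproblem for one scenario: min penalty * u  s.t.  u - s = d - x, u, s >= 0.
    Its optimal value is penalty * max(d - x, 0)."""
    x_fixed, demand, unit_penalty = args
    shortfall = max(demand - x_fixed, 0.0)
    return unit_penalty * shortfall


def expected_recourse_cost(x_fixed, scenarios, unit_penalty, workers=4):
    """scenarios: list of (demand, probability). Each scenario is solved independently."""
    jobs = [(x_fixed, demand, unit_penalty) for demand, _ in scenarios]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        costs = list(pool.map(solve_subproblem, jobs))
    return sum(prob * cost for (_, prob), cost in zip(scenarios, costs))


if __name__ == "__main__":
    scenarios = [(80.0, 0.25), (100.0, 0.5), (130.0, 0.25)]   # assumed data
    print(expected_recourse_cost(100.0, scenarios, unit_penalty=4.0))
```

Because no subproblem reads another's result, the only coordination needed is the final probability-weighted sum.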


3. The Inter-Period Stochastic Part of the Model (Ideas in this section are due to Mordecai
Avriel).

We now consider an equally relevant class of models where the contingencies that arise in one
period affect later periods. Suppose in one year the demand for some item is high and this higher
demand makes it more likely that demand in the following year will also be higher. The simplest
case would be one in which demand $d_{t+1}$ in period $t + 1$ is related to the demand $d_t$ in period $t$ by

\[
d_{t+1} =
\begin{cases}
\lambda d_t & \text{with probability } \alpha \\
\mu d_t & \text{with probability } 1 - \alpha = \beta
\end{cases}
\]

where $\lambda > \mu$.

In this simple example, the number of cases at time $t$ will be $2^t$, i.e., the size of the problem is
growing exponentially in $t$. However, because $\lambda \mu d_t = \mu \lambda d_t$, there is a consolidation in the number
of cases so that in fact, the number of cases is only growing proportional to $t$.
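The consolidation can be checked by brute force. The sketch below enumerates every high/low history over a given number of periods and counts the distinct demand levels reached; the multipliers 3/2 and 1/2 are assumed values chosen only for illustration, and exact fractions are used so that equal products compare equal.

```python
from fractions import Fraction
from itertools import product


def demand_cases(periods, lam, mu, d0=Fraction(1)):
    """Return (number of histories, number of distinct demand levels) after
    `periods` transitions, each multiplying demand by lam or by mu."""
    histories = 0
    distinct = set()
    for path in product((lam, mu), repeat=periods):
        demand = d0
        for factor in path:
            demand *= factor
        histories += 1
        distinct.add(demand)
    return histories, len(distinct)


if __name__ == "__main__":
    for t in (1, 2, 4, 8):
        print(t, demand_cases(t, Fraction(3, 2), Fraction(1, 2)))
```

The demand level depends only on how many high years occurred, not on their order, so $2^t$ histories collapse to $t + 1$ levels.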

Even though a low demand in year $t - 2$ followed by a high demand in year $t - 1$ leads, with the
same probability, to the same intermediate demand in period $t$ as a high demand in period $t - 2$
followed by a low demand in period $t - 1$, this does not mean we can equate the other components
of the state space. For models having a more complex structure, the state at time $t$ of the other
components of the system could be quite different depending on the history of their past states.
Mordecai Avriel of our research team has proposed a way to reduce the number of cases in general
by forcing the consolidation of the case HL, generated by a high (= H) followed by a low (= L) year,
and the case LH, generated by a low (= L) followed by a high (= H) year. He recommends averaging
the other components of their state vector at time $t$. This is illustrated below for a three-period
model:


FIND min $Z$, $(X_1, X_2^{L}, X_2^{H}, X_3^{LL}, X_3^{LH}, X_3^{HH}) \ge 0$:

\[
\begin{aligned}
A_1 X_1 &= b_1 \\
&\;\text{***} \\
-B_1^{L} X_1 + A_2^{L} X_2^{L} &= b_2^{L} \\
&\;\text{***} \\
-B_1^{H} X_1 + A_2^{H} X_2^{H} &= b_2^{H} \\
&\;\text{***} \\
-B_2^{LL} X_2^{L} + A_3^{LL} X_3^{LL} &= b_3^{LL} \\
&\;\text{***} \\
-\tfrac{1}{2} B_2^{LH} X_2^{L} - \tfrac{1}{2} B_2^{HL} X_2^{H} + A_3^{LH} X_3^{LH} &= b_3^{LH} \\
&\;\text{***} \\
-B_2^{HH} X_2^{H} + A_3^{HH} X_3^{HH} &= b_3^{HH} \\
&\;\text{***} \\
c_1 X_1 + \beta c_2^{L} X_2^{L} + \alpha c_2^{H} X_2^{H} + \beta^2 c_3^{LL} X_3^{LL} + 2\alpha\beta\, c_3^{LH} X_3^{LH} + \alpha^2 c_3^{HH} X_3^{HH} &= Z \;(\min)
\end{aligned}
\]

The asterisks *** above indicate that the intra-period stochastic constraints have been omitted.
The first and second set of omitted relations are:

\[
\begin{aligned}
d_1(\omega) &= -D_1(\omega) X_1 + F_1(\omega) U_1(\omega), & \omega &= 1, 2, \ldots \\
d_2^{L}(\bar{\omega}) &= -D_2^{L}(\bar{\omega}) X_2^{L} + F_2^{L}(\bar{\omega}) U_2^{L}(\bar{\omega}), & \bar{\omega} &= 1, 2, \ldots
\end{aligned}
\]

4. Solving the Inter-Period Part of the Stochastic Model

Without the intra-period constraints, when the number of time periods is small, it may be
practical to use standard linear programming software to directly solve the model. A model having
too many *** constraints may nevertheless be tractable if we replace operating constraints by “cuts”
generated when Benders decomposition is used to find tentative solutions to the subproblems [3].
These cuts, for example, are many inequalities of the form

\[
\begin{aligned}
\sum_{\omega} \pi_1(\omega) d_1(\omega) &\le -\sum_{\omega} \pi_1(\omega) D_1(\omega) X_1 + \theta_1, & \omega &= 1, 2, \ldots \\
\sum_{\bar{\omega}} \pi_2^{L}(\bar{\omega}) d_2^{L}(\bar{\omega}) &\le -\sum_{\bar{\omega}} \pi_2^{L}(\bar{\omega}) D_2^{L}(\bar{\omega}) X_2^{L} + \theta_2^{L}, & \bar{\omega} &= 1, 2, \ldots
\end{aligned}
\]

When the size of the inter-period part of the model becomes too large to solve directly using
standard linear programming software, it is planned to use nested Benders decomposition software.
Such software has already been developed by Robert Entriken for solving staircase systems by
modifying the MINOS linear programming code [19,23,24,25].

To apply Benders decomposition, the model is partitioned by columns, period by period (the dotted
vertical lines in the display of the model) [1]. The first “Master” corresponds to the columns in the
first period. The Master assigns a tentative value to $X_1$, say $X_1 = X_1^*$. The subproblem corresponds
to the remaining columns and is solved assuming that $X_1 = X_1^*$ is specified and terms $B_1^{L} X_1^*$ and
$B_1^{H} X_1^*$ have been added to the right hand side. This subproblem can, in turn, be partitioned into a
Master and Sub with the Master corresponding to variables $X_2^{L}$, $X_2^{H}$. (The earlier consolidation of
two states in period 3 complicates the discussion which follows.) To simplify matters assume, instead,
that either the term $B_2^{LH} X_2^{L}$ or $B_2^{HL} X_2^{H}$ is used to represent the LH or HL states but not both.
If this is done the Benders subproblem at each stage will turn out to be a set of smaller problems to be
iteratively solved, of which the following is typical:

\[
\begin{aligned}
A_2^{L} X_2^{L} &= b_2^{L} + B_1^{L} X_1^*, \quad X_2^{L} \ge 0 \\
\text{CUTS:}\quad -G_2^{\ell} X_2^{L} + \theta &\ge g_2^{\ell}, \quad \ell = 1, 2, \ldots \\
c_2^{L} X_2^{L} + \theta &= \min
\end{aligned}
\]

The theory of how such cuts are generated and how information is passed back and forth to other 
subproblems will be outlined next. 

4.0. Solving the Intra-Period Stochastic Model 

We begin with the simplest two-stage case first studied in Dantzig [10] and developed by R.
Wets [33,34,35]:

\[
\begin{aligned}
b_1 &= A_1 X_1, \quad (X_1, X_2) \ge 0, \\
b_2 &= -B_1 X_1 + A_2 X_2 \\
(\min)\; Z &= c_1 X_1 + c_2 X_2
\end{aligned}
\tag{3}
\]

where the first stage $(b_1, A_1, c_1)$ is known with certainty while the second stage $(b_2, c_2, B_1, A_2)$ is
assumed to be a function of a random variable $\omega$ with known probability distribution $p(\omega)$, $\omega \in \Omega$.
The values of $\omega$ in $\Omega$ may form a continuum, in which case $p_2(\omega)$ is a probability density;
or $\omega$ may take on a finite or an infinite set of discrete values, in which case $p_2(\omega)$ is
a discrete probability distribution where $\omega = 1, 2, \ldots, K$ and $K$ may be infinite. When applied
to the intra-period submodel discussed earlier, the equation $b_2 = -B_1 X_1 + A_2 X_2$ is replaced by
$d_1(\omega) = -D_1(\omega) X_1 + F_1(\omega) U_1(\omega)$ with corresponding changes in the objective form.

Let $\Omega$ be the sample space of $\omega$. For the purposes of the computational approach outlined in
this paper, we require that $\Omega$ be discrete with a finite number of elements. Practically speaking,
this is no restriction since any distribution may be approximated by a probability mass function
concentrated on a finite set of points. Then, assuming we label the sample points $\omega$ using
the integers $\{1, 2, \ldots, K\}$, the random vectors and matrices $(b_2, c_2, B_1, A_2)$ take on the value
$(b_2(\omega), c_2(\omega), B_1(\omega), A_2(\omega))$, $(1 \le \omega \le K)$, with known probability $p_2(\omega)$.

We now illustrate the approach for $K = 3$. The stochastic problem of minimizing expected
costs under uncertainty then has as its certainty equivalent the deterministic linear program that
we outlined earlier, except that now we describe the computational method in greater detail:

\[
\begin{aligned}
&\text{Find min } Z,\; X_1 \ge 0,\; X_2(\omega) \ge 0,\; \omega = 1, 2, 3: && (4.0) \\
&b_1 = A_1 X_1 && (4.1) \\
&b_2(1) = -B_1(1) X_1 + A_2(1) X_2(1) && (4.2) \\
&b_2(2) = -B_1(2) X_1 + A_2(2) X_2(2) \\
&b_2(3) = -B_1(3) X_1 + A_2(3) X_2(3) \\
&\min Z = c_1 X_1 + p_2(1) c_2(1) X_2(1) + p_2(2) c_2(2) X_2(2) + p_2(3) c_2(3) X_2(3) && (4.3)
\end{aligned}
\]


To simplify the discussion, assume a bounded optimal solution exists. It follows that we can
always find $\pi_2(\omega)$ to premultiply constraints corresponding to $b_2(\omega)$ above and subtract from the
objective so that the adjusted $c_2(\omega) \ge 0$. Therefore we can assume without loss of generality $c_2(\omega) \ge 0$.
Except as noted otherwise, we will assume $B_1$ is independent of $\omega$, i.e., $B_1 = B_1(\omega)$ for all $\omega$.

Typically, as we have already noted, this problem is solved using Benders' decomposition,
see [3]. The key idea is to replace the contribution of the second period variables to the objective
function by a scalar $\theta_2$, and to replace the second period constraints (4.2)
by a set of inequalities expressed in terms of $X_1$ and $\theta_2$ only, called “cuts”. These
are necessary conditions which are satisfied by all feasible and optimal solutions to (4). These cuts
are added sequentially ($\ell = 1, 2, \ldots$) to the first period problem, $A_1 X_1 = b_1$, $X_1 \ge 0$. These,
together with a modified objective $Z = c_1 X_1 + \theta_2$, constitute the “Restricted MASTER Problem”
whose $Z$ is a lower bound estimate for min $Z$ of (4.1) ... (4.3). Cuts are added to the Master
until they become sufficient to solve (4). This happens when the current value of the objective $Z$
for a feasible solution to (4) equals the lower bound estimate of min $Z$. In practice the iterative
process is stopped when this difference is judged to be “small enough”. Cuts come in two “flavors”:
feasibility cuts and optimality cuts. The “MASTER” problem for Benders' decomposition method
has the form:

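\[
\begin{aligned}
&\text{FIND min } Z,\; X_1 \ge 0,\; \theta_2 \ge 0: && (5.0) \\
&b_1 = A_1 X_1, && (5.1) \\
&\text{CUTS:}\quad g^{\ell} \le -G^{\ell} X_1 + \delta^{\ell} \theta_2, \quad \ell = 1, \ldots, L, && (5.2) \\
&\min Z = c_1 X_1 + \theta_2 && (5.3)
\end{aligned}
\]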

where $\delta^{\ell} = 0$ for feasibility cuts if the subproblem from which it was derived is infeasible, and
$\delta^{\ell} = 1$ for optimality cuts if the subproblem (6) below is feasible. The optimal solution $X_1 = X_1^*$
to (5.0) ... (5.3) is the value of $X_1$ that is temporarily specified and passed to the subproblem,
where it is “tested” to see if it qualifies as the first period component of some optimal solution
$[X_1, X_2(1), X_2(2), X_2(3)]$ for (4). This is done by solving the set of subproblems (6) below to see
(i) if the contribution $B_1 X_1^*$ from the first period implies for the second period a feasible solution
for every choice of $\omega$, and (ii) if it together with the set of optimal solutions to the second period
for every $\omega$ provides a global optimum to the original problem. Global optimality is easily tested
by checking whether the lower bound estimate for min $Z$ is equal to the value of $Z$ for the current
feasible solution. If the answer to (i) or (ii) is negative, the optimal $\pi_2(\omega)$ to (6) is substituted in
formula (7.1) or (7.2) below in order to generate cut $L + 1$, which is then added to the $L$ already
generated in (5.2). It can be proved that the current optimal solution of the Master violates the cut
condition and therefore the next optimal solution to the Master will generate an improved lower
bound for $Z$ [3].


4.1 The Sub Sub Problem 

For each $\omega$ in $\Omega$, FIND min $Z_2(\omega)$, $X_2(\omega) \ge 0$:

\[
\begin{aligned}
A_2(\omega) X_2(\omega) &= b_2(\omega) + B_1 X_1^* &&: \pi_2(\omega) \quad \text{(dual prices)} \\
p_2(\omega) \cdot c_2(\omega) X_2(\omega) &= Z_2(\omega) \;(\min)
\end{aligned}
\tag{6}
\]

where $\omega \in \{1, \ldots, K\}$. These problems are solved for $\omega = 1, \ldots, K$ and their optimal dual “prices”
(if (6) is feasible), or “infeasibility” prices (if not feasible), are computed and used as follows: If any
subproblem $\omega$ is infeasible, its infeasibility prices are used to generate a “feasibility” cut (7.1) below
with $\delta^{L+1} = 0$:

\[
g^{L+1} = \pi_2(\omega) b_2(\omega); \qquad G^{L+1} = \pi_2(\omega) B_1(\omega). \tag{7.1}
\]

If (6) is feasible for all $\omega \in \Omega$, then $X_1^*$ is tested for optimality by comparing the lower bound estimate
of $\theta_2$ from the master problem with $\sum_{\omega} Z_2(\omega)$. If the test fails, the expected values

\[
g^{L+1} = \sum_{\omega} \pi_2(\omega) b_2(\omega); \qquad G^{L+1} = \sum_{\omega} \pi_2(\omega) B_1(\omega) \tag{7.2}
\]

are used to generate new “optimality” cut conditions to augment those of (5.2) with $\delta^{L+1} = 1$.
Note that (7.2) are actually expected values because $\pi_2(\omega)$ as defined by (6) is proportional to the
probabilities $p_2(\omega)$.
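The interplay of master, subproblems, and optimality cuts can be traced on a very small instance. The sketch below is only an illustration under assumptions that are not in the paper: a scalar first-stage variable $x$ with unit cost `c`, a second stage that pays `q` per unit of demand exceeding $x$ (so every subproblem is feasible and only optimality cuts arise), dual prices available in closed form, and a one-dimensional master solved by enumerating breakpoints rather than by a linear programming code. In the notation of (6), $b_2(\omega)$ is the demand and $B_1 = -1$; all function names and data are hypothetical.

```python
def solve_subproblems(x, scenarios, q):
    """Solve every scenario subproblem for fixed x.
    Returns expected recourse cost and the cut data (g, G) of (7.2)."""
    cost = g = G = 0.0
    for demand, prob in scenarios:
        pi = prob * q if demand > x else 0.0   # dual price, proportional to prob
        cost += prob * q * max(demand - x, 0.0)
        g += pi * demand                       # pi * b_2
        G += pi * (-1.0)                       # pi * B_1 with B_1 = -1
    return cost, g, G


def solve_master(c, cuts, x_max):
    """min c*x + theta  s.t.  g <= -G*x + theta for each cut, theta >= 0, 0 <= x <= x_max.
    The objective is convex piecewise linear in x, so checking the end points and
    all pairwise intersections of the cut lines is enough in one dimension."""
    lines = [(0.0, 0.0)] + [(g, -G) for g, G in cuts]   # theta >= a - s*x
    candidates = {0.0, x_max}
    for i, (a1, s1) in enumerate(lines):
        for a2, s2 in lines[i + 1:]:
            if s1 != s2:
                x = (a1 - a2) / (s1 - s2)
                if 0.0 <= x <= x_max:
                    candidates.add(x)

    def theta(x):
        return max(a - s * x for a, s in lines)

    x_best = min(candidates, key=lambda x: (c * x + theta(x), x))
    return x_best, theta(x_best)


def benders(c, q, scenarios, x_max, tol=1e-9, max_iter=50):
    cuts, upper, x_best, lower = [], float("inf"), None, float("-inf")
    for _ in range(max_iter):
        x, theta = solve_master(c, cuts, x_max)
        lower = c * x + theta                      # lower bound from the master
        cost, g, G = solve_subproblems(x, scenarios, q)
        if c * x + cost < upper:                   # best feasible solution so far
            upper, x_best = c * x + cost, x
        if upper - lower <= tol:
            break
        cuts.append((g, G))                        # optimality cut, delta = 1
    return x_best, lower, upper


if __name__ == "__main__":
    scenarios = [(80.0, 0.25), (100.0, 0.5), (130.0, 0.25)]   # assumed data
    print(benders(c=1.5, q=4.0, scenarios=scenarios, x_max=200.0))
```

On this instance the lower bound rises with each added cut until it meets the best upper bound, which is the stopping test described above.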


5. The Concept of Reliable Systems 

The stochastic operations submodels above have been formulated so that facilities made available
for day-to-day operations are always sufficient to meet the demands on the system whatever
the contingency $\omega$. Formulating the model this way can make the facilities required too costly to
build. Instead of requiring the system to be always feasible, it is often formulated to be reliable,
i.e., feasible most of the time.

Conditions that place an upper bound on the allowed frequency of failure to meet demand
turn out to be non-convex when expressed in terms of the usual variables representing the levels
of operations and therefore cannot be approximated satisfactorily in a linear programming context
[16]. However, conditions that measure the expected amount of demand not satisfied are easy to
express linearly, one such constraint being added to each stage. The submodels for each state for
$\omega = 1, 2, \ldots$ will then no longer be independent. The way to restore independence is to make these extra
constraints correspond to a “Super Master” (in the Dantzig-Wolfe primal decomposition sense).
The Super Master systematically assigns penalty weights to the extra conditions and these are
used to modify the objective. If this approach is adopted, independence of the subproblems is
restored; the subproblems are reformulated so that they are always feasible whatever $\omega$ may be [11].

6. Using Parallel Processors 

The decomposition algorithm, however, is clearly only practical when $K$ is small. When $K$ is
large, it is proposed that parallel processors be used as high-speed sampling or quadrature devices
to effectively solve the subproblems. One idea is to have a processor at the MASTER level serve as
an integrator which sequentially receives as input estimates of the cuts (5.2). The Master Problem
is then solved to optimality with the estimates it has received so far and used to generate as output
revised $X_1 = X_1^*$ that are sent to other parallel processors which are busy solving (6) for various
choices of $\omega$. This process also provides a lower bound estimate for min $Z$ which monotonically
increases with each solution of the master problem.

The amount of space needed to store the generated cuts in the computer memory need not
be high. Assuming $B_1$ is independent of $\omega$, no more than $L \le m_2$ of the cuts will be tight on any
major iteration, where $m_2$ is the number of rows in $B_1$. This is so because $G^{\ell}$, generated by linear
combinations of the rows of $B_1$, has rank $\le r$ where $r \le m_2$ is the rank of $B_1$. The remainder may
be dropped (possibly to be regenerated on some later iteration).

Several parallel processors could be at the SUB level, each having as input the latest value of
$X_1^*$ and solving (6) in dual form for many random or stratified choices of $\omega$. When $c_2$, $A_2$ are the
same for all $\omega$, the dual of (6) is a linear program with only the dual objective $b_2(\omega)$ changing. By
judiciously stratifying the random sampling of $\Omega$ we hope to use the optimal basic dual feasible
solution for one $\omega$ to find quickly the optimal one for the next $\omega$. To provide cuts for the MASTER,
the parallel processors are to be used to determine the expected values $g^{L+1}$ and $G^{L+1}$ defined by
(7.2) or to approximate them by means of a large enough “importance sample”, see Section 7.

If it is practical to solve (6) for all $\omega$, the set of solutions to (6) generates a valid cut and a
correct lower bound estimate for min $Z$. In that case, the difference between the lower bound and
upper bound estimates can then be used to test optimality of $X_1^*$ for the original problem. When
$Z$, according to some specified tolerance, is close enough to the lower bound estimate for min $Z$, the
iterative process is stopped and $X_1^*$ declared “optimal”.

7. Importance Sampling

For the cases where $K$ is large, it is no longer possible to solve (6) for all $\omega$. Instead, we
propose to use random sampling to choose a set of $\omega$'s for which (6) will be solved [20,21]. This
solution strategy will require

a) the development of an efficient sampling plan,

b) the development of an efficient stopping rule.

By a), we refer to the fact that naive sampling, as a computational tool, will tend to be inefficient
in the sense that a large number of $\omega$'s will typically be needed to obtain a reasonable degree of
solution accuracy. The reason why this is so for the class of applications we have in mind is that
certain $\omega$'s play a particularly important role in the solution. For example, in an electric utility
capacity planning problem, the $\omega$'s corresponding to generator or transmission line failure, while
comprising only a small portion of the total sample space $\Omega$, are significant enough contingencies to
force the utility to “hedge”. Hence, it is important to design sampling schemes which concentrate
an appropriate level of computational effort on these “rare” $\omega$'s. We will use two basic ideas from
Monte Carlo simulation to accomplish this task: stratification and importance sampling [7,22,27].

In stratification, one pre-assigns a certain proportion of the total sample to each of (say) $m$
subsets partitioning the sample space $\Omega$. This increases the efficiency of the sampling procedure
by reducing the clustering effects typical of a conventional sampling scheme. For example, in naive
random sampling, the entire sample could (with small probability, of course) fall into one subset.
There is also a variant of stratification which we will be considering, called pre-stratification, that
is easier to program, see Cochran, W.G. [9].
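A minimal sketch of stratified sampling with proportional allocation follows. It assumes a finite sample space given as explicit (outcome, probability) pairs and a partition chosen by the user; the function name and the outage example are hypothetical, and the pre-stratification variant mentioned above is not shown.

```python
import random


def stratified_estimate(strata, f, n_total, rng):
    """Estimate E[f(omega)] with proportional allocation over strata.

    strata: list of strata, each a list of (omega, probability) pairs that
            together partition the sample space.
    """
    estimate = 0.0
    for stratum in strata:
        weight = sum(p for _, p in stratum)            # P(omega in stratum)
        n_h = max(1, round(n_total * weight))          # pre-assigned sample size
        omegas = [w for w, _ in stratum]
        conditional = [p / weight for _, p in stratum]
        draws = rng.choices(omegas, weights=conditional, k=n_h)
        estimate += weight * sum(f(w) for w in draws) / n_h
    return estimate


if __name__ == "__main__":
    # Assumed data: omega = 9 is a rare outage with a large cost.
    normal = [(w, 0.11) for w in range(9)]
    outage = [(9, 0.01)]
    cost = lambda w: 1000.0 if w == 9 else 0.0
    print(stratified_estimate([normal, outage], cost, n_total=100, rng=random.Random(0)))
```

Because the rare outage is guaranteed at least one draw, the estimate no longer depends on whether a naive sample happens to hit it.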

The second concept that we shall exploit is importance sampling. Within each subset of the
stratification partition, we can design our sampling procedure so that we sample not according to
the original probability mass function (or, more precisely, the original mass function conditioned on
$\omega$ belonging to the particular subset), but rather according to a mass function which assigns more
weight to the “important” elements of the sample space. By “important”, we mean those elements
which will contribute significantly to the average value of the dual variables. The estimator needs
to be appropriately adjusted to account for the new sampling mechanism, but this is easily done,
see Hammersley and Handscomb [22]. We intend to use both theory and exploratory data analysis
to guide us in developing efficient importance sampling algorithms.
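The adjustment of the estimator amounts to weighting each sampled value by the ratio of the original mass to the sampling mass. The sketch below assumes a finite sample space held in dictionaries; `proportional_to` is a hypothetical helper that builds the idealized sampling mass proportional to $p(\omega)|f(\omega)|$, which in practice is not available and would be replaced by an approximation. Here $f$ stands in for whatever quantity is being averaged, such as a dual price.

```python
import random


def importance_estimate(p, q, f, n, rng):
    """Estimate sum_w p[w] * f(w) by sampling w from q and reweighting by p/q.

    p, q: dicts mapping each omega to a probability; q[w] > 0 is required
          wherever p[w] * f(w) != 0.
    """
    omegas = list(q)
    draws = rng.choices(omegas, weights=[q[w] for w in omegas], k=n)
    return sum(f(w) * p[w] / q[w] for w in draws) / n


def proportional_to(p, f):
    """Hypothetical helper: sampling mass proportional to p[w] * |f(w)|."""
    raw = {w: p[w] * abs(f(w)) for w in p}
    total = sum(raw.values())
    return {w: v / total for w, v in raw.items()}


if __name__ == "__main__":
    p = {0: 0.99, 1: 0.01}                       # assumed: omega = 1 is a rare failure
    f = lambda w: 1000.0 if w == 1 else 1.0
    q = proportional_to(p, f)
    print(importance_estimate(p, q, f, n=50, rng=random.Random(0)))
```

With the idealized sampling mass every weighted draw equals the true mean, so the variance vanishes; a practical scheme can only approach this, which is why the choice of sampling mass matters.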

As for item b) above, we will need to develop a stopping rule (hopefully sequential)
which meshes appropriately with the mathematical programming ideas described elsewhere.
Specifically, the stopping rule should ensure that a sufficient accuracy is obtained at each iteration
of the sampling procedure so as to impart useful information to the optimization loop of the routine.
We expect the basic structure of the stopping rule to be of Chow-Robbins type, see [8].
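A simplified sequential rule in the Chow-Robbins spirit can be sketched as follows: keep sampling until the running variance estimate indicates that a confidence interval of prescribed half-width has been reached. The constant normal quantile `z`, the minimum sample size, and the function name are assumptions made for this illustration; it is not the rule the authors eventually adopt.

```python
import random


def sequential_mean(draw, half_width, z=1.96, n_min=2, n_max=1_000_000):
    """Sample until  variance_n + 1/n <= n * half_width**2 / z**2,
    then return the sample mean and the number of draws used."""
    n, mean, m2 = 0, 0.0, 0.0
    while n < n_max:
        x = draw()
        n += 1
        delta = x - mean
        mean += delta / n              # running mean (Welford update)
        m2 += delta * (x - mean)       # running sum of squared deviations
        if n >= n_min:
            variance = m2 / n
            if variance + 1.0 / n <= n * half_width ** 2 / z ** 2:
                break
    return mean, n


if __name__ == "__main__":
    rng = random.Random(0)
    print(sequential_mean(lambda: rng.gauss(10.0, 2.0), half_width=0.25))
```

The `1/n` term keeps the rule from stopping too early when the first few draws happen to show almost no spread.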

The above sampling ideas should prove to have powerful applications in the optimization
context of interest here. We believe this approach to be fundamental for two reasons, one historical
and the other prospective. First, the dimension of the sample space over which expectations need
to be computed usually is huge. Second, Monte Carlo methods are easy to extend to the parallel
computing environment, and the speed-ups are significant [30]. The reason, of course, is that Monte
Carlo methods are based on replication, and replication is trivial to distribute over many parallel
processors. For both the above reasons, we believe that Monte Carlo ideas, in conjunction with
the mathematical programming concepts developed for solving large-scale systems on mainframes,
form a promising avenue for the development of efficient solution algorithms for complex stochastic
optimization problems.